Quality control

Cell Ranger identified 17,356 cells.

The cutoff used by Cell Ranger to identify cell-associated barcodes did not appear to be accurate for sample AF2. To account for this, AF2 cells were first filtered to remove those with <800 UMIs.

Cells were filtered to only include those with >250 and <6,000 detected genes and <20% mitochondrial counts. These filtering steps were not performed on CHIKV-high cells because the virus may be affecting host transcription. After filtering, 11,978 cells remained for downstream analysis.
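
The filtering rules above can be sketched in a few lines of Python. This is an illustration only, not the code used to produce this report: the `Cell` record, its field names, and the sample name "AF1" are hypothetical. It also assumes that the AF2 UMI pre-filter applies to every AF2 cell and that only the gene and mitochondrial filters are skipped for CHIKV-high cells.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Hypothetical per-cell QC record; field names are illustrative only.
    sample: str
    n_umi: int                # total UMI counts
    n_genes: int              # number of detected genes
    pct_mito: float           # percent mitochondrial counts
    chikv_high: bool = False  # assumed flag marking CHIKV-high cells


def passes_qc(cell: Cell) -> bool:
    """Return True if the cell is kept for downstream analysis."""
    # Sample-specific pre-filter: drop AF2 cells with <800 UMIs.
    # Assumption: this step is applied before the CHIKV-high exemption.
    if cell.sample == "AF2" and cell.n_umi < 800:
        return False
    # CHIKV-high cells skip the gene and mitochondrial filters.
    if cell.chikv_high:
        return True
    # Keep cells with >250 and <6,000 detected genes and <20% mito counts.
    return 250 < cell.n_genes < 6000 and cell.pct_mito < 20.0
```

Both thresholds are strict inequalities, so a cell with exactly 250 detected genes or exactly 20% mitochondrial counts is removed.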

Summary


The number of cells filtered from each sample is shown below.


Cell Ranger

Cell Ranger metrics are shown for each sample.


Scatter plots


UMAPs

UMAP projections are shown for each replicate after filtering. Cells that appear to have a high percentage of mitochondrial reads harbor viral RNA and were not subjected to the final filtering steps.


After filtering cells, the raw counts for each gene were divided by the total counts for the cell, multiplied by a scaling factor (10,000), and log-transformed (NormalizeData). The 2,000 genes with the most cell-to-cell variation were then identified (FindVariableFeatures) and the data were scaled to prevent highly expressed genes from biasing the downstream analysis (ScaleData). For clustering and visualizing differences between cells, PCA and UMAP were used to reduce the dimensionality of the dataset (RunPCA, RunUMAP).
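
As a worked illustration of the normalization step, the sketch below applies the same arithmetic to the counts of a single cell. It is not the code used for this report. The function name `log_normalize` is hypothetical, and the transform is assumed to be the natural log of one plus the scaled value, which is the behavior of Seurat's default LogNormalize method.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Normalize the raw gene counts of one cell.

    Each count is divided by the cell's total counts, multiplied by
    scale_factor, and transformed with log(1 + x).
    """
    total = sum(counts)
    if total == 0:
        # A cell with no counts has nothing to normalize.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]
```

Using log(1 + x) keeps genes with zero counts at exactly zero after the transform, so the sparsity of the count matrix is preserved.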

UMAP projections are shown below with cells colored by sample ID. The mock- and CHIKV-infected samples show strong overlap.

UMAP projections are shown for cells divided by replicate; colors correspond to sample ID as shown above. The biological replicates show strong overlap and very similar overall structure, suggesting that batch effects are not strongly affecting gene expression patterns.


UMAP projection shows 8 hpi and 24 hpi samples with cells colored by sample ID.


UMAP projection shows 8 hpi and 24 hpi samples with cells colored by sample ID. Cells from the 24 hpi samples are shown in grey.

UMAP projections are shown for 8 hpi cells divided by sample ID; 24 hpi cells are shown in grey.




Cell type annotation

To identify cell clusters, a k-nearest neighbors graph was generated using the first 40 principal components (FindNeighbors). This graph was then used to iteratively group cells together using the Louvain method (FindClusters).
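
The first step above, building a k-nearest neighbors graph from principal component coordinates, can be sketched with a brute-force search. This is a conceptual illustration only. The function `knn_graph` is hypothetical, Euclidean distance is assumed, and the neighbor search and Louvain grouping actually performed by FindNeighbors and FindClusters are not reproduced here.

```python
import math


def knn_graph(embeddings, k):
    """Return, for each cell, the indices of its k nearest neighbors.

    embeddings: one coordinate tuple per cell (e.g. its PC scores).
    Distances are Euclidean; a cell is never its own neighbor.
    """
    neighbors = []
    for i, a in enumerate(embeddings):
        # Distance from cell i to every other cell, paired with the index.
        dists = [(math.dist(a, b), j) for j, b in enumerate(embeddings) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors
```

For example, `knn_graph([(0, 0), (0, 1), (5, 5), (5, 6)], k=1)` links the first two points to each other and the last two to each other. A community detection method such as Louvain would then group cells that are densely connected in this graph.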

Broad cell types were first assigned by comparing gene expression patterns for each cluster with data available from ImmGen (clustify). UMAPs below show cell type annotations for different clustering resolutions; the number of distinct clusters is shown in parentheses.
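
The idea behind reference-based annotation can be sketched as follows: the average expression profile of a cluster is correlated with each reference profile, and the best match is assigned. This is a simplified illustration, not the clustify implementation. The use of Spearman rank correlation, the `min_cor` cutoff of 0.5, and all function names are assumptions made for the example.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def annotate(cluster_profile, reference, min_cor=0.5):
    """Return the best-matching reference type, or 'unassigned'.

    reference maps a cell type name to an expression profile measured
    over the same genes, in the same order, as cluster_profile.
    """
    best_type, best_cor = "unassigned", min_cor
    for cell_type, ref_profile in reference.items():
        cor = spearman(cluster_profile, ref_profile)
        if cor > best_cor:
            best_type, best_cor = cell_type, cor
    return best_type
```

Clusters whose best correlation does not exceed the cutoff are left as "unassigned", which is one way such a label can arise in the annotations shown in this report.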


UMAP projections are shown for the selected cell type annotations (clustering resolution 5) with cells colored by type. The fraction of cells belonging to each type is shown on the right.




Cell type markers

Infection vs mock cell types

Differentially expressed genes were identified for each cell type by comparing replicates from the mock- and CHIKV-infected groups. The top upregulated genes in the CHIKV group are shown below. GO terms for marker genes are shown at the bottom. This analysis was only performed for cell types that were identified in all samples.

B cells

12 marker genes and 0 GO terms were identified.




Endothelial cells

25 marker genes and 19 GO terms were identified.




Fibroblasts

6 marker genes and 0 GO terms were identified.




Stromal cells (DN)

74 marker genes and 184 GO terms were identified.




T cells

11 marker genes and 0 GO terms were identified.




unassigned

14 marker genes and 0 GO terms were identified.




LEC subtypes

To classify LEC subtypes, cells were filtered to only include LECs and re-clustered. UMAPs are shown on the left with cells colored by sample ID. UMAPs shown on the right are colored by sample ID and divided by replicate ID.


LEC subtypes were assigned using gene expression data for LEC subsets identified previously (Xiang et al.). UMAPs below show LEC subtype annotations for different clustering resolutions; the number of distinct clusters is shown in parentheses.


UMAP projections are shown for the selected LEC annotations (clustering resolution 5) with cells colored by subtype; cells that are not LECs are shown in light grey. The fraction of cells belonging to each subtype is shown on the right.


To assess the accuracy of LEC annotations, the subtype assignments were compared back to the reference data (“ref_type”). The correlation with the reference datasets is shown below.




LEC markers

UMAP projection shows Marco expression.


Infection vs mock cell types

Differentially expressed genes were identified for each cell type by comparing replicates from the mock- and CHIKV-infected groups. The top upregulated genes in the CHIKV group are shown below. GO terms for marker genes are shown at the bottom. This analysis was only performed for cell types that were identified in all samples.

BEC

21 marker genes and 17 GO terms were identified.




cLEC

56 marker genes and 17 GO terms were identified.




Collecting

11 marker genes and 0 GO terms were identified.




fLEC

12 marker genes and 0 GO terms were identified.




Ptx3_LEC

8 marker genes and 0 GO terms were identified.




tzLEC

5 marker genes and 0 GO terms were identified.




unassigned

1 marker gene and 0 GO terms were identified.




Valve

5 marker genes and 0 GO terms were identified.




Non-endothelial SC subtypes

To classify non-endothelial SC subtypes, cells were filtered to only include stromal cells and re-clustered. UMAPs are shown on the left with cells colored by sample ID. UMAPs shown on the right are colored by sample ID and divided by replicate ID.


Non-endothelial SC subsets were assigned using gene expression data published previously (Rodda et al.). UMAPs below show annotations for different clustering resolutions; the number of distinct clusters is shown in parentheses.


UMAP projections are shown for the selected non-endothelial SC annotations (clustering resolution 5) with cells colored by subtype; other cells are shown in light grey. The fraction of cells belonging to each subtype is shown on the right.


To assess the accuracy of these cell annotations, the subtype assignments were compared back to the reference data (“ref_type”). The correlation with the reference datasets is shown below.




Non-endothelial SC markers

Infection vs mock cell types

Differentially expressed genes were identified for each cell type by comparing replicates from the mock- and CHIKV-infected groups. The top upregulated genes in the CHIKV group are shown below. GO terms for marker genes are shown at the bottom. This analysis was only performed for cell types that were identified in all samples.

Ccl19hi TRC

6 marker genes and 0 GO terms were identified.




Ccl19lo TRC

15 marker genes and 0 GO terms were identified.




CD34+ SC

34 marker genes and 33 GO terms were identified.




MRC

7 marker genes and 0 GO terms were identified.




Nr4a1+ SC

8 marker genes and 0 GO terms were identified.




PvC

37 marker genes and 32 GO terms were identified.




unassigned

14 marker genes and 0 GO terms were identified.




CHIKV RNA

To identify cells containing CHIKV RNA, reads were aligned to an mm10 reference containing the CHIKV genome. Viral counts are shown below on a UMAP projection.


To identify cells with high amounts of viral RNA, cells were first filtered to only include those with >5 CHIKV counts (6 cells). K-means clustering was then used to divide each sample into CHIKV-low and CHIKV-high populations. CHIKV counts are shown below for each sample. Cells are colored by the CHIKV-low/high groupings.
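
The low/high split can be illustrated with a two-cluster k-means on one-dimensional values. This is a minimal sketch, not the code used for this report. The function name `split_low_high` is hypothetical, the centers are initialized at the minimum and maximum values, and clustering is run on raw counts, whereas the report's analysis may have used transformed values.

```python
def split_low_high(counts, n_iter=100):
    """Two-cluster k-means on 1-D values.

    Returns a 'low' or 'high' label for each value in counts.
    """
    lo, hi = min(counts), max(counts)  # initial cluster centers
    if lo == hi:
        # All values are identical, so there is nothing to split.
        return ["low"] * len(counts)
    labels = []
    for _ in range(n_iter):
        # Assign each value to the nearer center.
        new_labels = ["high" if abs(c - hi) < abs(c - lo) else "low" for c in counts]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its assigned values.
        low_vals = [c for c, lab in zip(counts, labels) if lab == "low"]
        high_vals = [c for c, lab in zip(counts, labels) if lab == "high"]
        lo = sum(low_vals) / len(low_vals)
        hi = sum(high_vals) / len(high_vals)
    return labels


# Usage: keep cells with >5 viral counts, then split one sample's cells.
chikv_counts = [0, 3, 6, 8, 10, 900, 1200]  # made-up values for illustration
kept = [c for c in chikv_counts if c > 5]
labels = split_low_high(kept)
```

With these made-up values, the three cells with 6 to 10 counts fall into the low group and the two cells with 900 and 1,200 counts fall into the high group.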